Simplification and extension of non-periodic excitation source representations for high-quality speech manipulation systems
Abstract
A systematic framework for non-periodic excitation source representation is proposed for high-quality speech manipulation systems such as TANDEM-STRAIGHT, which is essentially a channel vocoder. The proposed method consists of two subsystems for non-periodic components: a colored noise source and an event analyzer/generator. The colored noise source is represented by a sigmoid model with non-linear level conversion. Its two model parameters, the boundary frequency and the slope, are estimated using pitch-range linear prediction, applied both on an F0-adaptively warped time axis and on the original time axis. The event subsystem detects events based on the kurtosis of filtered speech signals. The proposed framework provides significant quality improvement for high-quality recorded speech materials.
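The two subsystems named in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption: the function names, the logistic form of the sigmoid, the frame-wise kurtosis test, and the threshold are hypothetical, and the non-linear level conversion, the parameter estimation, and the filtering step are omitted.

```python
import math


def sigmoid_aperiodicity(freq_hz, boundary_hz, slope):
    """Hypothetical logistic curve: noise fraction (0..1) at freq_hz.

    boundary_hz is the frequency where the curve crosses 0.5 and slope
    (per Hz) sets how sharply it rises. Written with tanh, which is
    algebraically the logistic function but cannot overflow.
    """
    return 0.5 * (1.0 + math.tanh(0.5 * slope * (freq_hz - boundary_hz)))


def kurtosis(samples):
    """Non-excess kurtosis m4 / m2**2 (about 3 for Gaussian noise)."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    if m2 == 0.0:
        return 0.0  # constant frame: treat as "no event"
    return m4 / (m2 * m2)


def detect_events(signal, frame_len, threshold):
    """Start indices of non-overlapping frames whose kurtosis exceeds threshold."""
    events = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        if kurtosis(signal[start:start + frame_len]) > threshold:
            events.append(start)
    return events


# Illustrative values only, not taken from the paper.
steady = [1.0, -1.0] * 32          # flat-amplitude frame, low kurtosis
impulsive = [0.0] * 64
impulsive[10] = 1.0                # single isolated spike, high kurtosis
print(sigmoid_aperiodicity(3000.0, 3000.0, 0.005))
print(detect_events(steady + impulsive, 64, 10.0))
```

High kurtosis marks frames dominated by a few large samples, which is why an isolated spike is flagged while a steady-amplitude frame is not.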
Related papers
Excitation source analysis for high-quality speech manipulation systems based on an interference-free representation of group delay with minimum phase response compensation
A group delay-based excitation source analysis and design method is introduced as an extension of TANDEM-STRAIGHT, a speech analysis, modification and synthesis system. With this addition, all components of the system are based on interference-free representations: power spectrum, instantaneous frequency and group delay. This unification has potential to solve the majo...
Towards an improved modeling of the glottal source in statistical parametric speech synthesis
This paper proposes the use of the Liljencrants-Fant model (LF-model) to represent the glottal source signal in HMM-based speech synthesis systems. These systems generally use a pulse train to model the periodicity of the excitation signal of voiced speech. However, this model produces a strong and uniform harmonic structure throughout the spectrum of the excitation which makes the synthetic spe...
Uniform concatenative excitation model for synthesising speech without voiced/unvoiced classification
In general, speech synthesis using the source-filter model of speech production requires the classification of speech into two classes (voiced and unvoiced), which is prone to errors. For voiced speech, the input of the synthesis filter is an approximately periodic excitation, whereas for unvoiced speech it is a noise signal. This paper proposes an excitation model which can be used to synthesise both ...
Combination Resonance of Nonlinear Rotating Balanced Shafts Subjected to Periodic Axial Load
Dynamic behavior of a circular shaft with geometrical nonlinearity and constant spin, subjected to periodic axial load is investigated. The case of parametric combination resonance is studied. Extension of shaft center line is the source of nonlinearity. The shaft has gyroscopic effect and rotary inertia but shear deformation is neglected. The equations of motion are derived by extended Hamilto...
Synchronous overlap and add of spectra for enhancement of excitation in artificial bandwidth extension of speech
In this paper, a new approach that extends narrow-band excitation signals using synchronous overlap and add (SOLA) of spectra is proposed. Although artificial bandwidth extension (ABE) of speech has been extensively studied, the role of excitation spectra has not been as widely studied as the spectral envelope extension. In this study ABE is investigated with the widely used source-filte...